BioWeka - extending the Weka framework for bioinformatics
Abstract
Given the growing amount of biological data, data mining methods have become an integral part of bioinformatics research. Unfortunately, standard data mining tools are often not sufficiently equipped for handling raw data such as amino acid sequences. One popular and freely available framework that contains many well-known data mining algorithms is the Waikato Environment for Knowledge Analysis (Weka). In the BioWeka project, we add various input formats for bioinformatics data, as well as bioinformatics methods such as alignments, to Weka. This allows users to easily combine them with Weka's classification, clustering, validation and visualization facilities on a single platform. It therefore reduces the overhead of converting data between different formats, as well as the need to write custom evaluation procedures that can deal with many different programs. We encourage users to participate in this project by adding their own components and data formats to BioWeka. Availability: The software, documentation and tutorial are available at http://www.bioweka.org.
Similar resources
Data mining in bioinformatics using Weka
The Weka machine learning workbench provides a general-purpose environment for automatic classification, regression, clustering and feature selection, which are common data mining problems in bioinformatics research. It contains an extensive collection of machine learning algorithms and data pre-processing methods complemented by graphical user interfaces for data exploration and the experiment...
Comparative Analysis of Data Mining Tools and Classification Techniques using WEKA in Medical Bioinformatics
The availability of huge amounts of data has created a great need for data mining techniques that can generate useful knowledge. In the present study, we provide detailed information about data mining techniques, with a focus on classification as an important supervised learning technique. We also discuss WEKA software as a tool of choice to perform classification analysis for diffe...
WekaBioSimilarity - Extending Weka with Resemblance Measures
The classification of organisms is a routine task in biology as well as in other contexts. This process is usually carried out by comparing a set of descriptors associated with each object. However, general-purpose statistical packages offer a limited number of methods to perform such a comparison, and specific tools are required for each concrete problem. Weka is a freely available framework ...
Data Mining in Educational System using WEKA
Data mining, the extraction of hidden predictive information from large databases, is a powerful new technology with great potential, used in various commercial applications including retail sales, e-commerce, remote sensing and bioinformatics. Education is an essential element for the progress of a country. Data mining in an educational environment is called Educational Data Mining. Educational data min...
FlexDM: Enabling robust and reliable parallel data mining using WEKA
Performing massive data mining experiments with multiple datasets and methods is a common task faced by most bioinformatics and computational biology laboratories. WEKA is a machine learning package designed to facilitate this task by providing tools that allow researchers to select from several classification methods and specific test strategies. Despite its popularity, the current WEKA enviro...
Journal: Bioinformatics
Volume 23, Issue 5
Publication date: 2007